Methods to coordinate the execution of workflow replicas in a distributed environment
Abstract
In many distributed systems, robustness is a major concern because network nodes may fail spontaneously. Such failures become a serious problem when a workflow must run on a set of unreliable, distributed nodes. Replication is a widely used architectural paradigm for increasing system reliability. This thesis addresses the robustness of workflow execution in distributed systems using replication. A workflow is a plan that describes how a number of tasks need to be executed; it is defined using a workflow definition language. A task is the description of a single operation contained within such a workflow. The first part of this thesis gives an overview of available workflow definition languages, focusing mainly on declarative ones. The task types present in these languages are identified and their impact on replication is evaluated. A basic problem of workflow replication is that not all tasks within a workflow can be executed arbitrarily often. To solve this issue, the execution of such tasks must be coordinated. The goal of this thesis is to propose and evaluate methods to coordinate the execution of such workflow replicas in a distributed environment. The proposed replica coordination algorithms are implemented as a peer-to-peer protocol and simulated using the peer-to-peer simulator PeerSim. A synthetic workflow generator provides a large number of workflows for evaluation, so that performance, scalability, and robustness can be tested under different conditions. The evaluation concludes with the replication of a real workflow to judge how well the synthetic tests carry over to the real world.
Publication date: 2013